Attorney Docket 01376.730US1

What is claimed is: 



1. A computerized method comprising:
providing a first vector of addressing values;
providing a second vector of operand values;
storing a first sequence of values to a sequence of addressed locations within a constrained area of memory, wherein each location's address is based at least in part on a corresponding one of the addressing values;
reading back from the sequence of addressed locations values resulting from the storing of the first sequence to obtain a second sequence of values;
comparing the first sequence of values to the second sequence of values to generate a bit vector representing compares and miscompares;
compressing the second vector of operand values using the bit vector;
using the first vector of addressing values as masked by the bit vector, loading a third vector register with elements from memory;
performing an arithmetic-logical operation using values from the third vector register and the compressed second vector of operand values to generate a result vector; and
using the first vector of addressing values as masked by the bit vector, storing the result vector to memory.
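
For illustration only, the steps recited in claim 1 can be modeled in scalar Python. This is a minimal sketch under stated assumptions, not the claimed vector hardware: the name `scatter_add`, the parameter `n_bits`, the tag layout, the choice of addition as the arithmetic-logical operation, and the repetition of passes until every element has been applied are all hypothetical choices made for the example.

```python
def scatter_add(memory, base, idx, vals, n_bits=4):
    """Model of memory[base + idx[i]] += vals[i] that tolerates duplicate idx entries.

    Assumption: elements that miscompare are retried in later passes.
    """
    size = 1 << n_bits
    mask = size - 1
    scratch = [None] * size           # constrained area of memory (2**n_bits locations)
    pending = list(range(len(idx)))   # element numbers not yet applied

    while pending:
        # First sequence: one distinct tag per element (upper index bits + element number).
        first_seq = [(idx[i] >> n_bits, i) for i in pending]
        # Store the tags; when two elements map to the same location, the last store wins.
        for i, tag in zip(pending, first_seq):
            scratch[idx[i] & mask] = tag
        # Read back from the same locations to obtain the second sequence.
        second_seq = [scratch[idx[i] & mask] for i in pending]
        # Bit vector: True marks a compare, False marks a miscompare.
        bits = [a == b for a, b in zip(first_seq, second_seq)]

        # Compress the operands and mask the addressing values with the bit vector.
        addrs = [idx[i] for i, ok in zip(pending, bits) if ok]
        operands = [vals[i] for i, ok in zip(pending, bits) if ok]

        # Masked load, arithmetic-logical operation (addition here), masked store.
        loaded = [memory[base + a] for a in addrs]
        result = [m + v for m, v in zip(loaded, operands)]
        for a, r in zip(addrs, result):
            memory[base + a] = r

        # Elements that miscompared wait for a later pass.
        pending = [i for i, ok in zip(pending, bits) if not ok]
    return memory
```

Within one pass, at most one element per scratch location reads back its own tag, so the addresses used by the masked load and store contain no duplicates. At least one element compares in every pass, so the loop terminates.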

2. The method of claim 1, wherein addresses of the elements in memory are calculated by adding each respective addressing value to a base address of an object in memory.

3. The method of claim 1, wherein the arithmetic-logical operation is an addition operation that produces at least one element of the result vector as a summation of an element of the loaded third vector register and a plurality of respective elements of the original second vector of operand values corresponding to elements of the first vector of addressing values that had identical values.

4. The method of claim 1, wherein address values for the sequence of addressed locations within the constrained area of memory are each calculated using a truncated portion of each respective addressing value of the first vector of addressing values.

5. The method of claim 4, wherein data values of the first sequence of values are each formed by concatenating a portion of each respective addressing value of the first vector of addressing values to a respective one of a sequence of numbers.

6. The method of claim 1, wherein the constrained area of memory includes 2^N locations, wherein address values for the sequence of addressed locations within the constrained area of memory are each calculated by adding a base address to an N-bit portion of each respective addressing value of the first vector of addressing values, and wherein data values of the first sequence of values are each formed by concatenating a portion of each respective addressing value of the first vector of addressing values to a respective one of a consecutive sequence of integer numbers.
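
The address and data formation recited in claim 6 can be sketched as follows. This is an illustration only: the name `make_probe`, the parameter `seq_bits`, the choice of the low N bits as the "N-bit portion", and the choice of the remaining upper bits as the concatenated "portion" are assumptions, not details taken from the claims.

```python
def make_probe(addressing_values, scratch_base, n_bits, seq_bits=8):
    """Return (addresses, data) for the store into the constrained area.

    Assumptions: the low n_bits of each addressing value select one of the
    2**n_bits locations, and the data value is the upper bits of the
    addressing value concatenated with the element's position 0, 1, 2, ...
    """
    if len(addressing_values) > (1 << seq_bits):
        raise ValueError("seq_bits too small for this vector length")
    mask = (1 << n_bits) - 1
    # Address: base address plus an N-bit portion of the addressing value.
    addresses = [scratch_base + (v & mask) for v in addressing_values]
    # Data: upper bits of the addressing value, then a consecutive integer.
    data = [((v >> n_bits) << seq_bits) | k
            for k, v in enumerate(addressing_values)]
    return addresses, data


addresses, data = make_probe([0x13, 0x23, 0x13], scratch_base=100, n_bits=4)
```

Because the consecutive integer differs for every element, two elements that land on the same location always carry different data values, so a read-back can tell which store survived.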

7. The method of claim 1, wherein for the loading of the third vector register with elements from memory, elements are loaded from locations specified by addressing values corresponding to bits of the bit vector that indicated a compare and no elements are loaded from locations specified by addressing values corresponding to bits of the bit vector that indicated a miscompare.

8. The method of claim 1, wherein the operations recited therein are executed in the order recited therein.

9. The method of claim 1, further comprising:
performing a first synchronization operation that ensures that the comparing the first sequence of values to the second sequence of values to generate the bit vector representing compares and miscompares effectively completes before the loading of the third vector register with elements from memory; and
performing a first synchronization operation that ensures that the storing the result vector to memory completes before subsequent passes through a loop.

10. A computer-readable medium having instructions stored thereon for causing a suitably programmed information-processing system to execute a method comprising:
providing a first vector of addressing values;
providing a second vector of operand values;
storing a first sequence of values to a sequence of addressed locations within a constrained area of memory, wherein each location's address is based at least in part on a corresponding one of the addressing values;
reading back from the sequence of addressed locations values resulting from the storing of the first sequence to obtain a second sequence of values;
comparing the first sequence of values to the second sequence of values to generate a bit vector representing compares and miscompares;
compressing the second vector of operand values using the bit vector;
using the first vector of addressing values as masked by the bit vector, loading a third vector register with elements from memory;
performing an arithmetic-logical operation using values from the third vector register and the compressed second vector of operand values to generate a result vector; and
using the first vector of addressing values as masked by the bit vector, storing the result vector to memory.

11. A computerized method comprising:
loading a first vector register with addressing values;
loading a second vector register with operand values;
storing a first sequence of values to a sequence of addressed locations within a constrained area of memory, wherein each one of these locations' addresses in the constrained area of memory is based at least in part on a subset of bits of a corresponding one of the addressing values;
reading back from the sequence of addressed locations values resulting from the storing of the first sequence to obtain a second sequence of values;
comparing the first sequence of values to the second sequence of values;
selectively combining, with an arithmetic-logical operation, certain elements of the second vector of operand values based on results of the comparing;
using at least some of the first vector register of addressing values, loading a third vector register with elements from memory;
performing the arithmetic-logical operation using values from the third vector register and the combined second vector of operand values to generate a result vector; and
using the at least some of the first vector register of addressing values, storing the result vector to memory.

12. The method of claim 11, wherein addresses of the elements from memory are calculated by adding each respective addressing value to a base address.

13. The method of claim 11, wherein addresses of the elements from memory are calculated by performing a signed-addition operation of each respective addressing value to a base address of an object in memory.

14. The method of claim 11, wherein the arithmetic-logical operation is an addition operation that produces at least one element of the result vector as a summation of an element of the loaded third vector register and a plurality of respective elements of the original second vector of operand values corresponding to elements of the first vector register of addressing values having identical values.

15. The method of claim 11, wherein address values for the sequence of addressed locations within the constrained area of memory are each calculated using a truncated portion of each respective addressing value of the first vector register of addressing values.

16. The method of claim 15, wherein data values of the first sequence of values are each formed by concatenating a portion of each respective addressing value of the first vector register of addressing values to a respective one of a sequence of numbers.

17. The method of claim 11, wherein the constrained area contains 2^N consecutive addresses, wherein address values for the sequence of addressed locations within the constrained area of memory are each calculated using an N-bit value derived from each respective addressing value of the first vector register of addressing values, and wherein data values of the first sequence of values are each formed by concatenating a portion of each respective addressing value of the first vector register of addressing values to a respective one of a consecutive sequence of integer numbers.

18. The method of claim 11, wherein for the loading of the third vector register with elements from memory, elements are loaded from locations specified by addressing values corresponding to indications that indicated compares and no elements are loaded from locations specified by addressing values corresponding to indications that indicated miscompares.

19. A computer-readable medium having instructions stored thereon for causing a suitably programmed information-processing system to execute the method of claim 11.

20. The method of claim 11,
wherein the constrained area contains 2^N consecutive addresses,
wherein address values for the sequence of addressed locations within the constrained area of memory are each calculated using an N-bit value derived from each respective addressing value of the first vector register of addressing values,
wherein data values of the first sequence of values are each formed by combining at least a portion of each respective addressing value of the first vector register of addressing values to a respective one of a consecutive sequence of integer numbers,
wherein for the loading of the third vector register with elements from memory, elements are loaded from locations specified by addressing values corresponding to indications that indicated compares and no elements are loaded from locations specified by addressing values corresponding to indications that indicated miscompares,
wherein addresses of the elements from memory are calculated by adding each respective addressing value to a base address,
wherein the arithmetic-logical operation is a floating-point addition operation that produces at least one element of the result vector as an ordered-operation floating-point summation of an element of the loaded third vector register and a plurality of respective elements of the original second vector of operand values corresponding to elements of the first vector register of addressing values having identical values, and
wherein for the storing of the result vector of elements to memory, elements are stored to locations specified by addressing values corresponding to indications that indicated compares and no elements are stored to locations specified by addressing values corresponding to indications that indicated miscompares.
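
Floating-point addition is not associative, which is why the order of the summation in claim 20 matters. The sketch below shows one possible reading of "ordered-operation" summation, in which the memory element is taken first and the operands sharing an addressing value are then added in element order, so the result matches a plain scalar loop. The name `ordered_scatter_add` and this interpretation of the ordering are assumptions made for illustration.

```python
def ordered_scatter_add(memory, base, idx, vals):
    """Add every vals[i] into memory[base + idx[i]], combining duplicates in element order.

    Assumption: "ordered" means the same left-to-right order a scalar loop would use.
    """
    # Collect the operands that share an addressing value, preserving element order.
    groups = {}
    for i, v in zip(idx, vals):
        groups.setdefault(i, []).append(v)

    # One load, one ordered summation, and one store per distinct addressing value.
    for i, operands in groups.items():
        acc = memory[base + i]        # element loaded from memory
        for v in operands:            # operands added in their original order
            acc = acc + v
        memory[base + i] = acc
    return memory
```

Summing the duplicate operands first and adding that partial sum to the memory element afterwards would be a different order of operations and can round differently.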

21. A system comprising:
a first vector processor having:
a first vector register having addressing values;
a second vector register having operand values;
a third vector register;
a bit vector register;
circuitry that selectively stores a first sequence of values to a sequence of addressed locations within a constrained area of memory, wherein each location's address is based at least in part on a corresponding one of the addressing values;
circuitry that selectively loads, from the sequence of addressed locations, values resulting from the stores of the first sequence to obtain a second sequence of values;
circuitry that selectively compares the first sequence of values to the second sequence of values to generate bit values into the bit vector register representing compares and miscompares;
circuitry that selectively compresses the second vector of operand values using the values in the bit vector register;
circuitry that selectively loads the third vector register with elements from memory addresses generated from the first vector register of addressing values as masked by the bit vector register;
circuitry that selectively performs an arithmetic-logical operation on corresponding values from the third vector register and the compressed second vector of operand values to generate values of a result vector; and
circuitry that selectively stores the result vector to memory.

22. The system of claim 21, further comprising
circuitry to calculate addresses of the elements in memory by adding each respective addressing value to a base address value.

23. The system of claim 21, wherein the arithmetic-logical operation is an addition operation that produces at least one element of the result vector as a summation of an element of the loaded third vector register and a plurality of respective elements of the original second vector of operand values corresponding to elements of the first vector register of addressing values that had identical values.

24. The system of claim 21, further comprising
circuitry to calculate address values for the sequence of addressed locations within the constrained area of memory using a truncated portion of each respective addressing value of the first vector register of addressing values.



25. The system of claim 24, further comprising
circuitry to generate data values of the first sequence of values by joining a portion of each respective addressing value of the first vector register of addressing values to a respective one of a sequence of numbers.

26. The system of claim 21, further comprising
circuitry to generate address values of the sequence of addressed locations within the constrained area of memory by adding a base address to an N-bit portion of each respective addressing value of the first vector register of addressing values; and
circuitry to generate data values of the first sequence of values by combining a portion of each respective addressing value of the first vector register of addressing values with a respective one of a consecutive sequence of integer numbers.

27. The system of claim 21, wherein the circuitry that selectively loads the third vector register with elements from memory only loads elements from locations specified by addressing values corresponding to bits of the bit vector that indicated a compare.

28. The system of claim 21, further comprising:
synchronization circuitry that ensures that the comparing the first sequence of values to the second sequence of values to generate the bit vector representing compares and miscompares effectively completes before the loading of the third vector register with elements from memory, and that ensures that the storing the result vector to memory completes before subsequent passes through a loop.

29. The system of claim 21, further comprising:
a second vector processor having:
a first vector register having addressing values;
a second vector register having operand values;
a third vector register;
a bit vector register;
circuitry that selectively stores a first sequence of values to a sequence of addressed locations within a constrained area of memory, wherein each location's address is based at least in part on a corresponding one of the addressing values;
circuitry that selectively loads, from the sequence of addressed locations, values resulting from the stores of the first sequence to obtain a second sequence of values;
circuitry that selectively compares the first sequence of values to the second sequence of values to generate bit values into the bit vector register representing compares and miscompares;
circuitry that selectively compresses the second vector of operand values using the values in the bit vector register;
circuitry that selectively loads the third vector register with elements from memory addresses generated from the first vector register of addressing values as masked by the bit vector register;
circuitry that selectively performs an arithmetic-logical operation on corresponding values from the third vector register and the compressed second vector of operand values to generate values of a result vector; and
circuitry that selectively stores the result vector to memory; and
synchronization circuitry that ensures that the comparing the first sequence of values to the second sequence of values to generate the bit vector representing compares and miscompares effectively completes in both the first and second vector processors before the loading of the third vector register with elements from memory in either processor, and that ensures that the storing the result vector to memory completes before subsequent passes through a loop.

a second vector register;
a third vector register;
a bit vector register;
means for loading the first vector register with addressing values;
means for loading the second vector register with operand values;
means for storing a first sequence of values to a sequence of addressed locations within a constrained area of memory, wherein each one of these locations' addresses in the constrained area of memory is based at least in part on a subset of bits of a corresponding one of the addressing values;
means for loading from the sequence of addressed locations values resulting from the storing of the first sequence to obtain a second sequence of values;
means for comparing the first sequence of values to the second sequence of values;
means for selectively combining, with an arithmetic-logical operation, certain elements of the second vector of operand values based on results of the comparing;
means for loading a third vector register with elements from memory address locations generated using at least some of the first vector register of addressing values;
means for performing the arithmetic-logical operation using values from the third vector register and the combined second vector of operand values to generate a result vector; and
means for storing the result vector to memory.

31. A system comprising:
a first vector register that can be loaded with addressing values;
a second vector register that can be loaded with operand values;
a third vector register that can be loaded with operand values from memory locations indirectly addressed using the addressing values from the first vector register;
a circuit that determines element addresses of the first vector register that have a value that duplicates a value in another element address;
a circuit that selectively adds certain elements of the second vector of operand values based on the element addresses having the duplicated values;
a circuit that uses indirect addressing to selectively load the third vector register with elements from memory;
a circuit that selectively adds values from the third vector register and the second vector of operand values to generate a result vector; and
a circuit that selectively stores the result vector to memory using indirect addressing.

32. The system of claim 31, further comprising:
an adder that generates addresses of the elements from memory by adding each respective addressing value to a base address.

33. The system of claim 31, further comprising:
an adder that generates addresses of the elements from memory by a signed-addition operation of each respective addressing value to a base address of an object in memory.

34. The system of claim 31, wherein the circuit that selectively adds certain elements performs one or more addition operations using those values from a plurality of respective elements of the original second vector of operand values corresponding to elements of the first vector register of addressing values having identical values.